ftxt_list <- cut(1:length(emma), c(1, grep("^\\s*$", emma), length(emma))) %>%
split(janeaustenr::emma, f = .) %>%
lapply(paste, collapse = " ") %>%
tokenizers::tokenize_words(lowercase = FALSE, strip_punct = FALSE) %>%
as.fulltexttable() %>%
split(column = "tag_before", regex = "<para") %>%
retag(regex = "CHAPTER", old = "para", new = "h2") %>%
as.fulltexttable() %>%
split(column = "token", regex = "CHAPTER") %>%
rename(name = sprintf("Chapter %d", seq_along(.)))

We introduce the fulltext package by example. In addition to the fulltext package, we need the polmineR package, which includes the GERMAPARLMINI corpus.
## ... activating corpus: GERMAPARLMINI
## ... activating corpus: REUTERS
The example outputs one particular speech. We take a speech given by Volker Kauder in the German Bundestag.
sp <- corpus("GERMAPARLMINI") %>%
subset(speaker == "Volker Kauder") %>%
subset(date == "2009-11-10")

The data is passed to the JavaScript that generates the output. It is expected to be a list of lists, each of which describes one section of text. Each sub-list is a named list with two parts: a character vector naming the HTML element the section will be wrapped in, and a data.frame (or a list) with a column “token” and a column “id”.
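To make the expected input shape concrete, here is a minimal sketch in base R that builds such a list of lists by hand. The field names `element` and `tokenstream` and the helper `is_valid_section()` are assumptions made for illustration; only the columns "token" and "id" are taken from the description above, so check the package documentation for the actual names.

```r
# A hand-built list of sections. NOTE: the names `element` and `tokenstream`
# are hypothetical, chosen for illustration only.
sections <- list(
  list(
    element = "h2",                      # HTML element wrapping the section
    tokenstream = data.frame(
      token = c("CHAPTER", "I"),
      id = 0:1,
      stringsAsFactors = FALSE
    )
  ),
  list(
    element = "p",
    tokenstream = data.frame(
      token = c("Emma", "Woodhouse", ",", "handsome"),
      id = 2:5,
      stringsAsFactors = FALSE
    )
  )
)

# Hypothetical helper: checks that a section has one element name and
# a token container with the columns "token" and "id".
is_valid_section <- function(x) {
  is.character(x$element) && length(x$element) == 1L &&
    all(c("token", "id") %in% names(x$tokenstream))
}

all_valid <- all(vapply(sections, is_valid_section, logical(1)))
```

Because `names()` works on both data.frames and plain lists, the same check covers either form of the token container.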
So let us see the result …
ftxt_list <- lapply(
setNames(names(ftxt_list), names(ftxt_list)),
function(chapter) data.frame(ftxt_list[[chapter]], chapter = chapter)
)

library(crosstalk)
austen_chapters <- do.call(rbind, ftxt_list)
austen_chapters[["tag_before"]] <- gsub("display:block", "display:none", austen_chapters[["tag_before"]])
sd <- crosstalk::SharedData$new(austen_chapters, ~chapter, group = "fulltext")
chapters_table <- data.frame(chapter = unique(austen_chapters$chapter))
chapters_table_sd <- crosstalk::SharedData$new(chapters_table, ~chapter, group = "fulltext")
y <- bscols(
# widths = c(NA,NA),
DT::datatable(
chapters_table_sd,
options = list(lengthChange = TRUE, pageLength = 8L, pagingType = "simple", dom = "tp"),
rownames = NULL, width = "100%", selection = "single"
),
fulltext(sd, width = "100%", box = TRUE)
)

So this is the result.